


JOURNAL OF BALTIC SCIENCE EDUCATION

ISSN 1648-3898 /Print/
ISSN 2538-7138 /Online/


Abstract. This research aimed to evaluate students’ conceptual understanding and to diagnose their preconceptions in elaborating the particle characteristics of matter by developing a diagnostic instrument and applying a Rasch model response pattern analysis approach. Data were acquired with 25 multiple-choice written test items distributed to 987 students in North Sulawesi, Indonesia. The analysis of the diagnostic test item response pattern was conducted in three steps: 1) conversion of raw scores to a homogeneous interval unit and effectiveness analysis of the measurement instrument; 2) measurement of the disparity of students’ conceptual understanding; and 3) diagnosis of students’ preconceptions by estimation of the item response pattern. The results generated information on the diagnostic and summative measurement of students’ conceptual understanding in elaborating the topic; this information also acts as empirical evidence of the measurement’s reliability and validity. Moreover, the results revealed a significant disparity in students’ conceptual understanding based on their educational level. The distractor item response pattern tended to be consistent, indicating a tendency toward resistant preconception patterns. The findings are expected to serve as a recommendation for future researchers and educational practitioners who integrate diagnostic and summative measurement with the Rasch model in evaluating conceptual understanding and diagnosing misconceptions.


Introduction


Central to learning about the characteristics of particles of matter is the process of developing an understanding of abstract concepts (Johnstone, 1991) without directly interacting with the object or fact (Stojanovska et al., 2012); it is therefore considered a difficult subject for students to learn. As a result, disparity in understanding is almost inevitable (Kapici & Akcay, 2016), since different students may develop their own distinctive ways of understanding a concept (Yildirir & Demirkol, 2018). Experts refer to such ideas as misconceptions (Johnstone, 2006, 2010; Taber, 2015), or as alternative frameworks and preconceptions (Lu & Bi, 2016). Experts have found that students always hold preconceptions of their own that are not in line with scientific concepts (Alamina & Etokeren, 2018; Yasar et al., 2014); therefore, these preconceptions need to be identified and conceptual learning improved (Allen, 2014; Soeharto et al., 2019).

In diagnosing preconceptions, researchers have developed diagnostic instruments based on different mechanisms (McClary & Bretz, 2012), i.e., concept maps, essay tests, interviews, essay tests with interviews, or multiple-choice tests (Femintasari et al., 2015). The two-step multiple-choice diagnostic test (Adadan & Savasci, 2012; Chandrasegaran et al., 2007; Treagust, 1988; Tuysuz, 2009) is preferred for its ability to diagnose preconceptions and describe the underlying reasons. The instrument is considered qualitatively effective in elaborating differences in students’ thought processes; however, it does not provide summative measurement features because it lacks internal consistency and unidimensionality (Lu & Bi, 2016). In addition, the measurement conclusions it generates are considered weak because they are drawn from analysis of raw scores (Sumintono, 2018).

Studies on preconceptions have found them to be somewhat resistant. In the early 2000s, it was discovered that students’ preconceptions persisted even after they had undergone formal education (Hoe & Subramaniam, 2016). Preconceptions can also change along with the development of students’ conceptual understanding, and they vary across levels of understanding (Aktan, 2013). If a two-step test with raw score analysis is used to diagnose resistant preconceptions, the result provides only limited feedback information (Sumintono, 2018) because of the instrument’s limitation in measuring students’ conceptual understanding. Instead of supporting teachers, such information makes it harder for them to make proper instructional decisions (Wilson, 2008).

During the mid-2000s, Rasch model analysis was commonly used in chemistry education studies (Herrmann-Abell & DeBoer, 2011; Liu, 2012; Wei et al., 2012). The approach provides a testing apparatus that integrates diagnostic and summative measurement. Recently, it has been used to develop formative assessments intended to map learning construction, e.g., measuring how students construct their understanding (Hadenfeldt et al., 2013). It is worth noting, however, that some studies integrate diagnostic and summative measurement with a different approach (Hoe & Subramaniam, 2016); nevertheless, trends in chemistry education studies show that diagnostic-summative measurement by Rasch model analysis is more commonly carried out (Laliyo et al., 2019; Lu & Bi, 2016).


Research Problem 


The characteristics of particles of matter are a fundamental concept in chemistry, usually taught at the middle education level. Adequate comprehension of the particle characteristics of matter at both the macroscopic and microscopic levels is essential as the knowledge basis for understanding more advanced topics, such as the concept of atoms and molecules as submicroscopic components that are invisible to the naked eye but exist in all real-world phenomena (Cheng, 2018; Ozmen, 2011; Yildirir & Demirkol, 2018). This underlines the complexity of chemistry learning, which is considered difficult for both students and teachers (Alamina & Etokeren, 2018). In simpler terms, effective chemistry learning requires nurturing students’ comprehensive understanding of the particle characteristics of matter and its changes of state.

To evaluate students’ conceptual understanding of this topic, one also needs to measure students’ capability in interpreting the particle state during a change in the form of matter (Alamina & Etokeren, 2018; Barbera, 2013; Boz, 2006; Cheng, 2018; Gabel, 1993; Hadenfeldt et al., 2013; Kapici & Akcay, 2016; Kind, 2004; Naah & Sanger, 2012; Ozalp & Kahvecib, 2015; Ozmen, 2011; Renstrom et al., 1990; Slapnicar et al., 2017; Stojanovska et al., 2012; Yildirir & Demirkol, 2018). Studies on particle characteristics and changes of matter generally employ diagnostic instruments in the form of essay tests and/or essays followed by interviews, which are then analyzed based on raw scores. This approach is considered inefficient and somewhat inaccurate in measuring students’ conceptual understanding and misconception patterns. Despite its ineffectiveness, this conventional method is used by most teachers in Indonesia to measure and determine students’ learning progress. The teachers argue that students’ raw scores are effective in determining how far the students have progressed in the learning process. However, raw scores are regarded by many as only a preliminary indication of the measured variable and are not suitable as the final measurement indicator because of their temporary nature. In addition, for decision-making purposes, raw scores contain too little information to be treated as a reference (He et al., 2016; Sumintono & Widhiarso, 2015).


Research Focus 


The research focuses on developing a diagnostic instrument that integrates the measurement of conceptual understanding and the diagnosis of students’ preconceptions regarding the aforementioned topic, using Rasch model item response pattern analysis. The analysis employs different test apparatuses to provide extensive information for practitioners and researchers in science education in evaluating students’ learning progress in different topics.


Research Questions


This research aimed to answer the following questions: 1) How effective is the measurement instrument in evaluating students’ conceptual understanding and diagnosing their preconceptions about the characteristics of particles of matter? 2) Is there any significant difference between students in elaborating on the topic based on their educational level? 3) What is the pattern of students’ conceptual understanding and preconceptions regarding the topic?


Research Methodology 
General Background 


This descriptive-quantitative research employed a non-experimental approach, in which students’ conceptual understanding in explaining the characteristics of particles of matter was treated as the measured variable. Prior to the research, it was ensured that the students had already experienced formal learning of the topic. The researchers did not intervene in the learning process or the learning material. In other words, no treatment was given to the students to enable them to answer the test items in the measurement instrument.

Data collection took place over four months in the even semester of the 2019-2020 academic year; the process was conducted after obtaining approval from the Government of the Province of Gorontalo and the heads of universities in the northern part of Sulawesi, Indonesia. Moreover, the schools’ and parents’ approval was obtained in cooperation with the school committee. The school administrators were willing to facilitate the data collection process, which was adjusted to the school schedule.


Respondents 
The respondents were 987 people consisting of eleventh-grade students from eight lower-secondary schools as well as university students of the chemistry department in Northern Sulawesi, Indonesia. The distribution of respondents is displayed in Table 1 below.


Table 1
Demographic profile of respondents (N=987)

Demography                                            Code   Respondents   Percentage (%)
Gender
  Male                                                M      320           32.42
  Female                                              F      667           67.58
Education level
  X Class students                                    M      168           17.09
  XI Class students                                   N      473           47.92
  XII Class students                                  O      186           18.84
  University students from the chemistry department   P      160
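
The percentages in Table 1 follow from dividing each group count by the total number of respondents. The short Python sketch below reproduces that arithmetic from the counts reported in the table; the function name `percentage` and the dictionary layout are illustrative assumptions, and recomputed values may differ in the last digits from the printed ones.

```python
# Counts as reported in Table 1; all names here are illustrative only.
TOTAL_RESPONDENTS = 987

gender_counts = {"Male": 320, "Female": 667}
education_counts = {
    "X Class students": 168,
    "XI Class students": 473,
    "XII Class students": 186,
    "University students (chemistry department)": 160,
}


def percentage(count: int, total: int = TOTAL_RESPONDENTS) -> float:
    """Share of respondents in a group, in percent, rounded to two decimals."""
    return round(100.0 * count / total, 2)


for label, groups in (("Gender", gender_counts), ("Education level", education_counts)):
    print(label)
    for group, count in groups.items():
        print(f"  {group}: {count} ({percentage(count)}%)")
```

Both groupings sum to the same total, which is a quick consistency check on the reported sample size.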


The respondents were chosen randomly and voluntarily agreed to participate in the research. They received no learning treatment or other special treatment that would help them complete the measurement instrument. Students were asked to write their responses on the answer sheet; the process was supervised by teachers in the respective schools and lecturers in the respective university. All students were instructed to answer all questions in the instrument within 45 minutes. All instrument sheets and answer sheets were collected by the researchers shortly after the session ended, and it was ensured that the number of instruments matched the number of participants. For ethical assurance, permission was obtained from the school administration after coordinating with students’ parents through the school committee. This process was conducted before the students were invited to participate in the research. Permission for the university students was obtained from the department leaders of the university and through students’ written statements. All students were told that the confidentiality of their identity was fully guaranteed and that the results of the study would only be used for research purposes.

https://doi.org/10.33225/jbse/20.19.824
Journal of Baltic Science Education, Vol. 19, No. 5, 2020
ANALYTIC APPROACH OF RESPONSE PATTERN OF DIAGNOSTIC TEST ITEMS IN EVALUATING STUDENTS’ CONCEPTUAL UNDERSTANDING OF CHARACTERISTICS OF PARTICLE OF MATTER (PP. 824-841)


Instrument and Procedures Development 


The design process follows the recommendation of Wilson (2005), which consists of four key steps: definition of the construct map, item design, result blank (outcome space), and measurement model.

Phase 1: Definition of construct map. The map offers a substantive definition of the measured constructs; the more constructs are measured, the more their levels vary qualitatively (Wilson, 2009). In simpler words, it aims to map students’ understanding in order to measure their progress (Wilson, 2012). The instrument involved two variables, i.e., students’ conceptual understanding and preconceptions in elaborating the characteristics of particles of matter; it was developed in accordance with the Curriculum Standard of the Chemistry Subject for Tenth Grade in Indonesia, as presented in Table 2.


Table 2
Conceptual Understanding Level

Level 3  The students are able to connect the characteristics of a particle of matter at the macroscopic and submicroscopic levels
  Phenomenon Evaporation: item Q6/Bubble
    10. Preconception  Air bubble consists of Hydrogen and Oxygen particles
    9. Preconception   Air bubble is water-soluble
  Phenomenon Condensation: item Q5/Dew
    8. Preconception   Water drops come from melting ice that penetrates the glass wall
    7. Preconception   Water drops are the result of the reaction between ice and air near the glass

Level 2  The students are able to determine the SMRs diagram of particle structure during a change of form: items Q11/SMRs/SL; Q12/SMRs/LG; Q13/SMRs/GS; Q25/SMRs/GG
    6. Preconception   The SMRs diagram of particle structure follows the physical form of matter
    5. Preconception   The SMRs diagram of O2 molecule shape undergoes change as a result of an increase in the volume of the container

Level 1  The students are able to determine the characteristics of a particle of matter during the change process of matter’s form
    4. Preconception   The particle size of matter changes (large/small) as a result of change in matter form: items Q1/PS/SL; Q7/PS/LG; Q14/PS/LG; Q18/PS/SG; Q22/PS/GG
    3. Preconception   The particle mass of matter changes (large/small) due to change in matter form: items Q2/PM/SL; Q8/PM/LG; Q15/PM/LG; Q19/PM/SG; Q24/PM/GG
    2. Preconception   Distance between matter particles changes (dense/loose) due to change in matter form: items Q3/DP/SL; Q9/DP/LG; Q16/DP/LG; Q20/DP/SG; Q23/DP/GG
    1. Preconception   Motion between matter particles changes (faster/slower) due to change in matter form: items Q4/PMo/SL; Q10/PMo/LG; Q17/PMo/LG; Q21/PMo/SG


Variation in the conceptual understanding level illustrates the development of students’ conceptual understanding. At the first level, the students were asked to determine particle characteristics (size, mass, motion, and distance) during a change in the form of matter. At the second level, the students were asked to determine the submicroscopic representation diagram of particle structure. At the third level, the students were asked to connect the characteristics of a particle of matter at the macroscopic and submicroscopic levels. At each level, the construct map also features students’ tendency toward preconceptions.

Phase 2: Item design and evaluation. The phase involved determining the items to be used in acquiring evidence of students’ understanding with respect to the construct map (Wilson, 2005). Items may differ in how effectively they measure students’ conceptual understanding (Sadler, 1999); however, multiple-choice items are considered more practical and effective (Wilson, 2008). The concept understanding test of the particle (TPKP) instrument is adapted from the multiple-choice instruments of Herrmann-Abell and DeBoer (2011). Each item consists of one correct answer choice, two distractor answer choices, and one open answer choice. The distractor answer choices are designed by referring to students’ common preconceptions (see Table 2) as logical choices that draw students away from the correct one. The distractors serve to strengthen the diagnostic power of the item (Sadler, 1998). Some of the items are adopted from previous studies (Osborne & Cosgrove, 1983; Renstrom et al., 1990; Devetak et al., 2004; Toth & Kiss, 2006; Davidowitz et al., 2010; Devetak & Glazar, 2010; Slapnicar et al., 2017; Yildirir & Demirkol, 2018).


Figure 1
Sample of item Q1/PS/SL design

Glass (a) contains ice chunks; glass (b) contains melting ice chunks. How is the size of a water particle in solid form (ice) compared to that in liquid form?

a. Size of a water particle in solid form > a water particle in liquid form.
b. Size of a water particle in solid form < a water particle in liquid form.
c. Size of a water particle in solid form = a water particle in liquid form.
d. Other answers





Figure 1 displays a sample of the item Q1/PS/SL design, in which Q1 is item number 1, PS is particle size, and SL is solid-liquid. The item measures students’ capability in determining particle size in the change of form from solid to liquid. Choices A and B are distractors, the correct choice is C, and choice D is for other answers that students may fill in if the existing answer choices do not match their initial knowledge. Every correct answer was given a mark of 1, and wrong answers were given 0. With four choices, a student has only a 0.25 probability of choosing the right answer by guessing. Students pick what they think is the right answer based on their understanding. If the distractor choices function well, students will not be able to guess the correct answer.

Phase 3: Design of result blank, i.e., the correlation between the construct map and the items (Wilson, 2005). This phase aimed to identify whether the answers students pick correlate with their conceptual understanding; in simpler terms, it was intended to establish the conformity of the item contents with the variable being measured. To this end, the TPKP instrument was validated by three independent experts and tried out with students to acquire their feedback. The process yielded 25 TPKP items. Prior to the data collection process, it was ensured that all students had received formal education on the characteristics of particles of matter and its changes. Students’ responses to the instrument were entered manually from the written answer sheets. The test was supervised by the teachers in school in accordance with the agreed permission and duration. Each student was required to finish all test items within the allocated duration of 45 minutes. The instrument sheets were then collected and checked to ensure that the number of instrument sheets matched the number of participating students.

Phase 4: Rasch model analysis approach. The analysis expresses the probabilistic expectation for item i and student n as:

$$P(x_{ni} = 1 \mid \theta_n, \delta_i) = \frac{e^{\theta_n - \delta_i}}{1 + e^{\theta_n - \delta_i}}$$

The statement is the probability of student n answering item i correctly (x = 1), given the student ability, $\theta_n$, and the item difficulty level, $\delta_i$ (Bond & Fox, 2015). The above equation was simplified by applying the logarithm function into

$$\ln\left(\frac{P(x_{ni} = 1)}{1 - P(x_{ni} = 1)}\right) = \theta_n - \delta_i$$

so that the log odds of picking the right answer equal the student’s ability minus the item difficulty level. The student (person) and item units were considered to be on the same interval scale and independent of each other. The students’ ability level and item difficulty level were measured in a logarithmic unit, namely the log odds unit or logit, which ranges from $-\infty$ to $+\infty$ (Herrmann-Abell & DeBoer, 2011; Sumintono & Widhiarso, 2015). The efficiency of the instrument in measuring students’ conceptual understanding could be quantified by comparing the distribution of item difficulty levels with the distribution of students’ ability levels. In addition, students’ understanding levels were differentiated based on the item size. These steps highlight the main difference between Rasch model analysis and the conventional raw score-based analysis; the latter lacks accuracy in evaluating students’ ability across items of different difficulty levels (Herrmann-Abell & DeBoer, 2011; Lu & Bi, 2016; Sumintono & Widhiarso, 2015).
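
The dichotomous Rasch model described in Phase 4 can be illustrated with a few lines of Python. This is a minimal sketch of the standard model, not the software used in the research; the function names `rasch_probability` and `log_odds` are hypothetical, and abilities and difficulties are assumed to be expressed in logits.

```python
import math


def rasch_probability(theta: float, delta: float) -> float:
    """Probability of a correct answer for ability theta and item difficulty delta (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def log_odds(p: float) -> float:
    """Natural log of the odds p / (1 - p); defined only for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# When ability equals difficulty, success and failure are equally likely.
print(rasch_probability(0.5, 0.5))  # 0.5

# Taking the log odds of the model probability recovers theta - delta.
p = rasch_probability(1.2, 0.4)
print(round(log_odds(p), 6))
```

Because only the difference between ability and difficulty enters the model, persons and items can be placed on one common logit scale.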


Data Analysis


The research employed WINSTEPS version 3.75 software to convert raw data into interval data (Bond & Fox, 2015; Linacre, 2012). The conversion result acted as the calibration of data on students’ ability levels and item difficulty levels within the same interval measurement. The analysis of the diagnostic test item response pattern was then conducted in three steps: 1) conversion of raw scores to a homogeneous interval unit and effectiveness analysis of the measurement instrument; 2) measurement of the disparity of students’ conceptual understanding by the Differential Item Functioning (DIF) item test; and 3) diagnosis of students’ preconceptions by estimation of the item response pattern through the option probability curve test.
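
The first step rests on the idea that raw scores are not equally spaced, whereas log odds are. As a simplified illustration only (this is not the estimation procedure WINSTEPS runs, which calibrates persons and items together), the Python sketch below converts a raw score on a 25-item test into log odds; the function name `raw_score_to_logit` is hypothetical.

```python
import math


def raw_score_to_logit(score: int, n_items: int = 25) -> float:
    """Log odds of a raw score: ln(correct / incorrect).

    Zero and perfect scores have no finite log odds, so they are rejected.
    """
    if not 0 < score < n_items:
        raise ValueError("log odds are undefined for zero or perfect scores")
    return math.log(score / (n_items - score))


# One additional correct answer moves a student further on the logit scale
# near the top of the score range than in the middle of it.
middle_step = raw_score_to_logit(13) - raw_score_to_logit(12)
upper_step = raw_score_to_logit(24) - raw_score_to_logit(23)
print(middle_step < upper_step)  # True
```

The unequal step sizes show why a one-point difference in raw score does not always represent the same difference in ability.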


Research Results 
Effectiveness of Measuring Instruments 
Person and Item Reliability. The first step in elaborating on the effectiveness of the measuring instrument was to measure person and item reliability. This was conducted to determine the extent to which the measurement produces consistent information in displaying the latent trait or the unidimensionality of the measured variable (Sumintono & Widhiarso, 2015). The analysis result is presented in the form of a statistical summary (Table 3).


Table 3
Summary of fit statistics

                          INFIT           OUTFIT
Parameter (N)   Measure   MNSQ   ZSTD     MNSQ   ZSTD     Separation   Reliability   SD     KR-20
Person (987)    -.34      1.00   -.11     1.02   -.1      1.55         .71           .88    .72
Items (25)      .00       1.00   -.75     1.02            8.18         .99           .60


The above table indicates that the person reliability value of 0.71 is equivalent to a person separation index of 1.55. This means that the consistency of students’ responses to the test is deemed good. In addition, the Cronbach Alpha coefficient (KR-20) is 0.72, signifying good interaction between students and the test. This further indicates a strong correlation among students’ responses to the items, in the sense that students’ knowledge tends to be non-fragmented, enabling it to be measured (Adams & Wieman, 2011). For researchers and educational practitioners, such information is essential for preparing follow-up plans and developing students’ ability (Wei et al., 2012). Moreover, the result generated a relatively high item separation index of 8.18, equivalent to an item reliability value of 0.99. This indicated very good item consistency; that is, the items were deemed capable of meeting the unidimensionality criteria. In other words, the items performed very well in defining the measured variable. This was confirmed by the infit and outfit values, most of which were in the acceptable range for a multiple-choice test (Bond & Fox, 2015; Herrmann-Abell & DeBoer, 2011).
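
The equivalence between reliability and separation quoted above follows the standard Rasch relation R = G^2 / (1 + G^2), where G is the separation index. The Python sketch below applies it to the values reported in Table 3; the function names are hypothetical, and small differences can arise because the reported figures are rounded.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Reliability R implied by a separation index G: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


def separation_from_reliability(reliability: float) -> float:
    """Separation index G implied by a reliability R: G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))


# Separation indices reported in Table 3.
print(round(reliability_from_separation(1.55), 2))  # 0.71 (person)
print(round(reliability_from_separation(8.18), 2))  # 0.99 (item)
```

The two functions are inverses of each other, so either index can be derived from the other.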

Figure 2 displays the measurement information graph, which shows the measurement reliability. The higher the peak of the information function, the higher the measurement reliability. At the intermediate level of students’ ability (-3.0 logit up to +3.0 logit), the measurement information is very high. This indicates that the TPKP instrument is capable of producing optimal information for students with an intermediate level of ability. Such a result means that the instrument possesses high measurement reliability (Bond & Fox, 2015; Kim & Wilson, 2019).


Figure 2
Function of Measurement Information

[Plot: Test Information Function; vertical axis: Information; horizontal axis: Measure]


Note: M = X Class students, N = XI Class students, O = XII Class students, and P = University students from chemistry department 


Validity. The next step was to measure item validity with the item fit test to ensure that all items fit the Rasch model. The process aimed to identify whether the test items measured what they were intended to measure, i.e., test validity (Linacre, 2012; Sumintono & Widhiarso, 2014). The criteria used comprise the outfit mean-square (MNSQ): 0.5 < MNSQ < 1.5; the outfit z-standard (ZSTD): -2.0 < ZSTD < +2.0; and the point measure correlation (PTMEA Corr): 0.4 < PTMEA Corr < 0.8. The PTMEA Corr is the correlation between the item score and the person measure, which is required to be positive and not close to zero (Bond & Fox, 2015). If none of the three criteria is met, the item is not good enough and needs further elaboration (Boone et al., 2014). Both Outfit MNSQ and Infit MNSQ are chi-square statistics that are sensitive in detecting outlier response patterns. Two kinds of outlier response were considered: a correct response guessed by a low-ability student on an item with a high difficulty level, or a wrong response caused by the carelessness of a high-ability student on an item with a low difficulty level. The expected ideal MNSQ value is 1.0. The analysis result on item fit is displayed in Table 4 as follows:


Table 4
Item Statistics: Misfit Order

                        INFIT           OUTFIT
Item          Measure   MNSQ   ZSTD     MNSQ   ZSTD     PTMEA Corr
Q6/Bubble     .60       1.26   7.0      1.40            .07
Q2/PM/SL      .88       1.16   3.0      1.27   4.4      .18
Q15/PM/LG     .66       1.12   3.3      1.20   3.8      .22
Q14/PS/LG     .97       1.03   .8       1.18   2.8      .20
Q18/PS/SG     .71       1.07   1.8      1.15   2.9      .28
Q5/Dew        -.63      1.06   2.6      1.15
Q7/PS/LG      .66       1.06   1.6      1.14   2.8      .29
Q8/PM/LG      .65       1.07   2.0      1.11   2.1      .28
Q1/PS/SL      .79       1.00   -.1      1.06            .35
Q24/PM/GG     .29       1.04   1.4      1.04   1.4      .33
Q19/PM/SG     .77              -.3      1.03   .6       .36
Q3/DP/SL      -.44      1.01            1.00   -.1      .34
Q10/PMo/LG    -.07      .98    -.8      .98    -.9      .38
Q13/SMRs/GS   -.24      .98    -1.1     .98    -.6      .38
Q9/DP/LG      -.32      .97    -1.6     .95    -1.6     .39
Q4/PMo/SL     -.66      .96    -2.0     .93    -1.8     .39
Q25/SMRs/GG   -.68      .94    -2.9     .91    -2.4     .41
Q16/DP/LG     -.47      .94    -3.1     .91    -2.8     .42
Q23/DP/GG     -.44      .92    -3.7     .93    -2.1     .43
Q12/SMRs/LG   -.63      .92    -3.8     .87    -3.5     .44
Q21/PMo/SG    -.66      .92    -4.0     .89    -2.9     .43
Q17/PMo/LG    -.71      .94    -4.4     .87    -3.5     .44
Q22/PS/GG     -.07      .90    -4.4     .87    -3.9     .47
Q11/SMRs/SL   -.27      .90    -4.9     .87    -4.0     .47
Q20/DP/SG     -.65      .86    -6.6     .83    -4.6     .49


From the item statistics above, all items meet the Outfit MNSQ criterion and no negative PTMEA Corr occurs. This means that no item is deviant; all items are appropriate and valid. Although some items do not meet one or more of the criteria, this does not decrease the quality of the items. For instance, items Q6/Bubble, Q2/PM/SL, and Q15/PM/LG do not meet the Outfit ZSTD and PTMEA Corr criteria; items Q1/PS/SL, Q24/PM/GG, and Q19/PM/SG do not meet the PTMEA Corr criterion; and items Q25/SMRs/GG, Q16/DP/LG, and Q23/DP/GG do not meet the Outfit ZSTD criterion; this is presumably caused by the large sample size, N > 500 (Boone et al., 2014).
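
The three fit criteria and the rule that an item needs further elaboration only when it meets none of them can be sketched in Python as follows. The class and function names are hypothetical, and the two example rows use values as read from Table 4; the sketch illustrates the decision rule only and does not reproduce how WINSTEPS computes the statistics.

```python
from dataclasses import dataclass


@dataclass
class ItemFit:
    name: str
    outfit_mnsq: float
    outfit_zstd: float
    ptmea_corr: float


def check_fit(item: ItemFit) -> dict:
    """Return, for each criterion, whether the item falls inside the accepted range."""
    return {
        "outfit_mnsq": 0.5 < item.outfit_mnsq < 1.5,
        "outfit_zstd": -2.0 < item.outfit_zstd < 2.0,
        "ptmea_corr": 0.4 < item.ptmea_corr < 0.8,
    }


def needs_review(item: ItemFit) -> bool:
    """An item is flagged only when it meets none of the three criteria."""
    return not any(check_fit(item).values())


# Values as read from Table 4 (assumed transcription of the printed figures).
items = [
    ItemFit("Q2/PM/SL", outfit_mnsq=1.27, outfit_zstd=4.4, ptmea_corr=0.18),
    ItemFit("Q25/SMRs/GG", outfit_mnsq=0.91, outfit_zstd=-2.4, ptmea_corr=0.41),
]

for item in items:
    print(item.name, check_fit(item), "review:", needs_review(item))
```

Both example items pass the Outfit MNSQ criterion, so neither is flagged even though each misses at least one of the other two criteria.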

Wright Map: Person-Map-Item. The third step was to examine the consistency between the item difficulty levels and the students’ ability levels constructed in Table 2. The higher the difficulty level of an item, the higher the student ability level it requires. The Wright Map: Person-Map-Item is displayed in Figure 3. The map shows that the instrument items cover almost the whole range of students’ ability. It shows variation from students with very high ability (> 3.0 logit) to those with very low ability (< -2.0 logit). In addition, gaps (in which no item matches the students’ ability) were observed in the interval of -3.0 logit up to -0.5 logit and in the interval of +1.0 logit up to +3.7 logit. This signified that the information generated within these intervals was somewhat limited and required further elaboration. On the other hand, the item difficulty levels were mostly located in the interval of -1.0 logit up to +1.0 logit; moreover, several items tended to share the same difficulty level. Item Q14/PS/LG was the most difficult item with a logit of +0.97, while item Q17/PMo/LG was the easiest item with a logit of -0.71.

Figure 3

As observed from the differences in item size, some interesting cases emerged. Firstly, the level 1 items Q14/PS/LG (0.97) > Q1/PS/SL (0.79) > Q18/PS/SG (0.71) > Q7/PS/LG (0.66) were found by the students to have different difficulty levels. These items, moreover, were more difficult than item Q6/Bubble in level 3 (0.60). In other words, determining particle size was more difficult than explaining the particle characteristics of matter in the evaporation phenomenon. Secondly, the size of item Q5/Dew (-0.63) < item Q6/Bubble; this indicated that it was harder for the students to elaborate on the particle characteristics of matter in the evaporation phenomenon than in the condensation phenomenon, even though both items were at the same level. Thirdly, the sizes of the level 1 items Q2/PM/SL (0.88) > Q19/PM/SG (0.77) > Q15/PM/LG (0.66) > Q8/PM/LG (0.65) > Q24/PM/GG were larger than those of the level 2 items Q13/SMRs/GS (-0.24) > Q11/SMRs/SL (-0.27) > Q12/SMRs/LG (-0.63) > Q25/SMRs/GG (-0.68). This finding illustrated that it was harder for the students to determine the particle mass than to determine the submicroscopic representation (SMRs) diagram in different form changes of matter. These cases identified disparity in students’ conceptual understanding, signifying that the level of understanding of the particle characteristics of matter is relatively low. Overall, the difficulty levels of 80% of the test items are relatively parallel with the measured constructs. Thus, the test possesses good construct validity (Blanc & Rojas, 2018; Lu & Bi, 2016; Neumann et al., 2011).


Disparity in Conceptual Understanding Level 


The next step was the measurement of disparity of students’ conceptual understanding in the focused topic 
based on educational level by Differential Item Functioning (DIF). 


Figure 4
Person DIF plot based on educational level

Note: M = X Class students, N = XI Class students, O = XII Class students, and P = University students from chemistry department


Figure 4, the DIF plot based on students’ educational level, shows that ten items were identified as having significant disparity. Firstly, the five curves approaching the upper limit are items with a high difficulty level (Q14/PS/LG, Q2/PM/SL, Q15/PM/SG, Q24/PM/GG, and Q6/Bubble), while the five curves approaching the lower limit are items with a low difficulty level (Q20/DP/SG, Q21/PMo/SG, Q12/SMRs/LG, Q13/SMRs/GS, and Q5/Dew). Secondly, items Q14/PS/LG (particle size in the liquid-gas change of form), Q2/PM/SL (particle mass in the solid-liquid change of form), and Q15/PM/SG (particle mass in the solid-gas change of form) were deemed very hard by XII class students and university students compared to students in X and XI classes. Thirdly, the research found different results for items Q24/PM/GG and Q6/Bubble. Items Q24/PM/GG (particle mass of O2 in a larger volume) and Q6/Bubble (constituent elements of air bubbles during the boiling of water) were deemed very hard by X class
students compared to students in XI and XII classes, as well as university students. Fourthly, the items Q20/DP/SG 


833 


—<—Ly) 


https://doi.org/10.33225/jbse/20.19.824 


Journal of Baltic Science Education, Vol. 19, No. 5, 2020 


ANALYTIC APPROAGH OF RESPONSE PATTERN OF DIAGNOSTIC TEST ITEMS IN EVALUATING ISSN 1648-3898 /Print/ 
STUDENTS’ GCGONCGEPTUAL UNDERSTANDING OF GHARACTERISTICS OF PARTIGLE OF MATTER 
(PP. 824-841) ISSN 2538-7 | 38 /Online/ 


(distance between particles in form change of solid-gas), Q21/PMo/SG (motion between particles in form change 
of solid-gas), Q12/SMRs/LG (SMRs diagram of particle in form change of liquid-gas), Q13/SMRs/GS (SMRs diagram 
of particle in change form of gas-liquid), and Q5/Dew (condensation) were deemed too easy for students in XI 
class and university students compared to the students in X and XI classes. 
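The group comparison behind a DIF plot can be illustrated with a rough sketch. The code below is not the Rasch estimation used in this research; it swaps in a simple log-odds approximation of item difficulty, centred within each group, and assumes dichotomously scored responses. The group labels M and P follow the note to Figure 4, whereas the function names and the response data are hypothetical and serve illustration only.

```python
import math
from collections import defaultdict

def item_logit(correct_flags):
    """Rough item difficulty in logits: ln((1 - p) / p), with p kept away from 0 and 1."""
    n = len(correct_flags)
    p = sum(correct_flags) / n
    p = min(max(p, 0.5 / n), 1 - 0.5 / n)  # avoids log(0) for all-right or all-wrong items
    return math.log((1 - p) / p)

def dif_by_group(responses):
    """responses: list of (group, {item: 0 or 1}) tuples.
    Returns {item: {group: difficulty centred on the group's mean item difficulty}}."""
    by_group = defaultdict(lambda: defaultdict(list))
    for group, scored in responses:
        for item, flag in scored.items():
            by_group[group][item].append(flag)
    result = defaultdict(dict)
    for group, items in by_group.items():
        raw = {item: item_logit(flags) for item, flags in items.items()}
        mean = sum(raw.values()) / len(raw)  # centring removes overall group ability
        for item, value in raw.items():
            result[item][group] = value - mean
    return dict(result)

def dif_contrast(dif, item, group_a, group_b):
    """Positive value: the item is relatively harder for group_a than for group_b."""
    return dif[item][group_a] - dif[item][group_b]

# Hypothetical scored responses (1 = correct) for two items and two groups
responses = [
    ("M", {"Q2": 1, "Q5": 1}), ("M", {"Q2": 1, "Q5": 0}),
    ("M", {"Q2": 0, "Q5": 1}), ("M", {"Q2": 0, "Q5": 0}),
    ("P", {"Q2": 1, "Q5": 1}), ("P", {"Q2": 0, "Q5": 1}),
    ("P", {"Q2": 0, "Q5": 1}), ("P", {"Q2": 0, "Q5": 0}),
]
dif = dif_by_group(responses)
contrast = dif_contrast(dif, "Q2", "P", "M")
```

A contrast far from zero for one item, relative to the other items, is the kind of disparity that the curves approaching the upper or lower limit in a DIF plot make visible.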


Pattern of Conceptual Understanding and Preconception 


The analysis of the pattern of conceptual understanding and preconception employed option probability
curves (Boone et al., 2014; Linacre, 2012). An option probability curve displays the probability of picking
each answer choice across the performance levels of all students on the measured item
(Herrmann-Abell & DeBoer, 2011). The analysis relies on the principle that the curve of the correct answer rises as
the curves of the distractor choices fall (Boone et al., 2014; Haladyna, 2004). For items that are influenced by
distractor options, the curve tends to depart from the traditional monotonic item behavior
(Sadler, 1998); for this reason, each answer choice was analyzed separately.

The instrument provides four answer choices, thus resulting in four curves per item. Each curve reflects the students’
comprehension. Students with low ability tended to pick distractor choices, while students with high ability
were more likely to prefer other preconceptions (Herrmann-Abell & DeBoer, 2011; Perera et al., 2018). Below is
the elaboration of the pattern of students’ conceptual understanding and preconception based on four option
probability curves.
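How such curves arise from response data can be sketched empirically. The code below is a minimal illustration, assuming each student has an ability estimate in logits and one chosen option; it groups students into ability bins and computes the proportion choosing each option per bin. The function name, the bin width, and the sample records are hypothetical and are not taken from the research data.

```python
from collections import Counter, defaultdict

def option_probability_curves(records, bin_width=1.0, options="ABCD"):
    """records: iterable of (ability_in_logits, chosen_option).
    Returns {bin_lower_edge: {option: proportion of students choosing it}}."""
    counts = defaultdict(Counter)
    for ability, choice in records:
        lower = (ability // bin_width) * bin_width  # floor to the bin's lower edge
        counts[lower][choice] += 1
    curves = {}
    for lower in sorted(counts):
        total = sum(counts[lower].values())
        curves[lower] = {opt: counts[lower][opt] / total for opt in options}
    return curves

# Hypothetical responses to one item: (ability in logits, option picked)
records = [
    (-1.5, "B"), (-1.2, "A"), (-1.9, "B"), (-1.1, "D"),
    (1.2, "C"), (1.8, "C"), (1.5, "B"), (1.1, "C"),
]
curves = option_probability_curves(records)
```

Plotting each option's proportion against the bin edges gives one curve per answer choice. A distractor whose proportion stays high in the upper bins corresponds to the resistant preconception pattern discussed for the sample items.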


Figure 5
(a) sample of item Q2/PM/SL, (b) option probability curve (axis label: Overall Student Performance)


The first example, item Q2/PM/SL (0.88), is shown in Figure 5(a). The item measures students’ ability
to determine particle mass in the form change from solid to liquid. The option probability curve is displayed in
Figure 5(b). Students with low ability (< 0.5 logit) tended to pick distractor B (mass of one particle of ice
is smaller than the mass of one particle of water) or A (mass of one particle of ice is bigger than the mass of one
particle of water). In addition, students with very low ability (< -1.0 logit) tended to pick D (other answers). Some
students with relatively low ability (> -2.5 logit), however, picked the right answer C (mass of one particle of ice is
similar to the mass of one particle of water). The response pattern of students with low ability is predictable, as
distractors A, B, and D contain type-three preconceptions at level 1 (see Table 2). These students hold the belief
that the mass of a particle of matter becomes larger or smaller, based on observing the matter’s change of form. It
is interesting to note that some students with high ability (> 2.0 logit) picked B; this indicates the presence
of a resistant preconception.


Figure 6
(a) item Q8/PM/LG; (b) option probability curve (axis label: Overall Student Performance)


The second sample, item Q8/PM/LG (0.65), is shown in Figure 6(a); it measures students’ ability
to determine the mass of the particle in the form change of liquid-gas. The option probability curve is displayed in
Figure 6(b). Distractor B (mass of one water particle is smaller than the mass of one vapor particle) was
chosen by students with low ability (< -2.0 logit), while distractor A (mass of one water particle is bigger
than mass of one vapor particle) was chosen by students with ability in the range of -3.5 to 1.5 logit. The correct
answer, option C (mass of one water particle is similar to the mass of one vapor particle), was chosen by students
with ability above -2.5 logit. As the curves show, the decline of the curve of distractor A is followed by the
rise of the curve of the right answer C; the two curves intersect at 1.0 logit. The shape of curve A indicates
the presence of a resistant type-three preconception at level 1.

This depicts a particular item response pattern that signifies students’ conceptual understanding patterns
at the given level. Moreover, the curves of distractors A and B in items Q2/PM/SL and Q8/PM/LG tend to
follow an identical pattern. This finding indicates that students with either low or high ability held the consistent
preconception that the mass of a particle becomes larger or smaller as the matter changes form.

The third sample, item Q5/Dew (-0.63), shown in Figure 7(a), measures the students’ ability to elaborate the
characteristics of a particle in the condensation phenomenon. The option probability curve is displayed in Figure 7(b).
Students with low ability (< 1.0 logit) tended to pick distractor A (water drops come from liquid of melting ice that
breaks through the glass wall) and option D (other answers). Some students with high ability (> 1.0 logit) also
picked distractor B (water drops are the result of the reaction between ice and air nearby the glass). Curve B is
wavy and non-linear; in the interval of 2.0 to 4.0 logit, its option probability even reaches a value of up to
1.0. This is regarded as a deviation from the right answer C (water drops come from condensing water vapor
nearby the glass). It is also worth noting the unstable, wavy shape of curve C, which indicates the
students’ inconsistency (particularly those with high ability) in comprehending the concept of condensation. This
confirms that students had their own preconceptions regarding the concept of condensation.


Figure 7
(a) item Q5/Dew; (b) option probability curve (axis label: Overall Student Performance)


The fourth sample, item Q6/Bubble, shown in Figure 8(a), measures the students’ ability to elaborate the
characteristics of a particle in the evaporation phenomenon. The option probability curve is displayed in Figure
8(b). Distractor A (air bubbles are Hydrogen and Oxygen particles) was dominantly chosen by students whose
ability ranged from -3.0 to 2.0 logit. Moreover, distractor B (air bubbles are Hydrogen and Oxygen
particles) was mostly selected by students whose ability ranged from -3.0 to 0.5 logit. That curves A
and B were picked by students with low ability was predictable. The curve of the right answer C (air bubbles are water
molecules), however, shows an interesting feature: in the interval of -2.5 to 3.0 logit, the curve follows an
up-and-down pattern. Moreover, at 1.5 logit, the curves of distractors A and B decline,
while curve C tends to rise. Another finding worth noting is that option D (other answers)
was picked by some students with high ability (> 2.0 logit). This indicates that particular students had their own
preconceptions regarding the evaporation concept.


Figure 8
(a) item Q6/Bubble; (b) option probability curve (axis label: Overall Student Performance)

(a) In a container filled with boiling water, you can see air bubbles on the top of it. According to you, what are
the composing elements of the air bubbles?

A. Hydrogen and Oxygen particles
C. Water molecules
D. Other answer...


Discussion


The research results indicated that the instruments were effective, met the requisites of person
and item reliability, and showed good construct validity. When they were applied to evaluate students’ conceptual
understanding, two findings emerged. Firstly, almost all students with high ability faced difficulty in understanding
the concept of particle size and mass at level 1. The same students found it relatively easy to determine the
SMRs diagram of particle structure at level 2 or to apply the concept of particle to evaporation
and condensation phenomena at level 3. Secondly, the response pattern of students with
high ability was quite consistent, repetitive, and systematic in particular items. This indicates the presence of
permanent and latent preconceptions. The analysis of the option probability curves of items Q2/PM/SL (0.88),
Q8/PM/LG (0.65), Q5/Dew (-0.63), and Q6/Bubble (0.60) indicates that the item response pattern approach is able
to explore students’ conceptual understanding and preconceptions in detail and comprehensively.

The sequence of verification steps involving the Rasch model approach yields detailed, accurate, and
quantifiable results, since the approach integrates the development procedures of diagnostic and summative
instruments. Several samples of preconception, e.g., items Q2/PM/SL (0.88) and Q8/PM/LG (0.65), indicate that
distractor options have the potential to be elaborated further in order to investigate the students’ tendency
toward preconceptions. In addition, the approach provides information regarding the main ideas unknown to the
students and their degree of misunderstanding.

The approach employed in this research is an effective illustration to help teachers in evaluating the learning
process as well as the students’ learning progress. This is due to the integration of a qualitative item development
procedure and quantitative data analysis, which allows teachers to explore in depth the students’ understanding,
the concepts the students do and/or do not understand, and their misconceptions. Such findings
echo Herrmann-Abell and DeBoer (2016) that the integration of Rasch model analysis and option probability curves
is applicable to diagnose how the students’ misconceptions relate to their overall conceptual understanding.
Such an attempt is quite hard to conduct with a conventional approach due to the interdependence
of person and item. The Rasch model, on the other hand, is able to tackle such interdependence: the
item and test difficulty remain invariant and do not depend on which sample is involved in the initial
validation. This signifies that the instrument’s items have met the unidimensionality and local independence
requirements (Jin et al., 2019; Testa et al., 2019; Wei et al., 2012).
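The invariance property can be illustrated numerically with the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The sketch below assumes that the values in parentheses after the item codes (0.88 for Q2/PM/SL and 0.65 for Q8/PM/LG) are item measures in logits; the function names and the ability values are hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: P(correct) = exp(a - d) / (1 + exp(a - d))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def log_odds_gap(ability, difficulty_1, difficulty_2):
    """Log-odds of success on item 2 minus log-odds of success on item 1, for one person."""
    return (log_odds(rasch_probability(ability, difficulty_2))
            - log_odds(rasch_probability(ability, difficulty_1)))

# Assumed item measures in logits for Q2/PM/SL and Q8/PM/LG
Q2, Q8 = 0.88, 0.65

# Hypothetical person abilities in logits
gaps = [log_odds_gap(theta, Q2, Q8) for theta in (-2.0, 0.0, 2.0)]
```

Under the model, every entry of `gaps` equals the difference between the two item measures (0.88 - 0.65 = 0.23), whichever ability is used. This is the sense in which the comparison of items does not depend on the sample of persons.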

Overall, the research provided empirical evidence that supports the findings by Hoe and Subramaniam
(2016), Lu and Bi (2016), and Rogat et al. (2011) that students had distinctive preconceptions as a result of the
learning process they experienced. Such preconceptions were regarded as inhibitors of the development of
students’ conceptual understanding (Soeharto et al., 2019). In this research, students’ preconceptions were found
to be repetitive and systematic at each education level. This signifies that changing students’
preconceptions is difficult with the conventional learning method. A strategic and meaningful
learning method is therefore essential to remove students’ incorrect preconceptions and develop scientifically
correct conceptual understanding. For that reason, teachers need to acquire detailed information on
the forms and characteristics of students’ preconceptions. The item response pattern analysis
was an efficient and effective means to acquire such information. The information on students’ preconceptions
is important as the basis for developing an appropriate and measurable instructional design to resolve the students’
misconceptions. This is in line with previous research studies, which argue that the quality of learning progress
is highly dependent on the students’ learning process and learning experience (Duschl et al., 2011; Park et al.,
2017; Wilson, 2009).


Conclusions 


The measuring instrument developed performed well in terms of validity and reliability; thus, it is deemed
applicable for measuring students’ conceptual understanding and preconceptions in elaborating the particle
characteristics of matter. During the implementation of the instrument, the research found that:

1) Almost all students with high ability face difficulty in understanding the concept of particle size
and mass at level 1. The same students find it relatively easy to determine the SMRs diagram of
particle structure at level 2, as well as to apply the concept of particle to evaporation and
condensation phenomena at level 3.

2) There is a significant disparity between students’ conceptual understanding based on their
educational level.

3) In certain cases, the distractor item response pattern of high-ability students tends
to be consistent, indicating a certain tendency of resistant preconception pattern.

The development of diagnostic instruments with the Rasch model approach is regarded as a literacy-building process
for practitioners and researchers in Indonesia. The result indicates that no single item is aligned
with both the highest-ability and the lowest-ability students. This calls for further elaboration in order to improve
the quality of the instrument items. Moreover, an anomaly was found: students with high ability (> 1.0 logit) tend
to pick distractor choices. This urges further studies to investigate structured comprehension problems. Further
analysis that integrates conceptual understanding levels with items designed in
a gradual manner is required to define the characteristics of the students’ alternative conceptions and to measure
their learning progress. In line with this, future researchers and educational practitioners who implement the
approach of the present research should ground the item design in the basic principles of
chemistry. In addition, although it does not focus on students’ individual learning progress,
the instrument is expected to be beneficial for teachers in diagnosing
students’ conceptions and developing an effective and meaningful learning experience.